An Access Path Specification Language
for  Restructuring  Network  Databases 

by 

Donald  Swartwout 

University  of  Michigan 
Center  for  Database  Systems  Research 


ABSTRACT 

Current approaches to restructuring are based on the hierarchical data
model and use specific operations to achieve the required transformations.
Our approach to restructuring the class of network databases has
three principal characteristics:

1) Data model - The Relational Interface Model (RIM) permits the
database to be viewed simultaneously as a network database and
a relational database in first normal form.

2) Restructuring system - The system can perform the operations of the
relational algebra but is not restricted to any data-model-dependent
operations. It can perform, in a simple and efficient manner,
user-required network transformations such as link record processing.

3) Access Path Specification Language - APSL is a verifiably powerful,
structurally simple, and descriptive language.

The access path approach permits the specification of complex restructuring
transformations in terms of application-oriented concepts such as
access strategies and selection criteria. A high-level Access Path
Specification Language (APSL) based on this approach is presented, and
an example of its use in restructuring is given.


Key  Words  and  Phrases:  database  restructuring,  data  translation,  network 
restructuring,  network  databases,  restructuring  software,  data  translation 
software,  restructuring  languages,  data  translation  language,  translation 
specification  languages. 
1.0 INTRODUCTION

The currently expanding use of very large databases in business,
industrial, and governmental applications has created a need for a
generalized reorganization capability for three principal reasons.

First, the lack of a systematic database design methodology [DL23] results
in poorly designed and consequently poorly performing databases.
Secondly, the lack of an adequate requirements methodology to precisely
define user needs, coupled with the inability of users to formulate
requirements beyond their present activities, results in an outdated system
design. The third major reason is the currently changing technology. New
hardware advances and software capabilities are being made available at an
ever increasing rate.

To address this problem, new technology is being developed for the
migration of data and software between environments, or the reorganization
of data within an environment [DT4, DT9, DT10, R3, R5, R9]. At the University
of Michigan, a series of increasingly general data translators has been
implemented over the past four years to support the migration and
reorganization of data [DT1, DT3, DT11]. As indicated in Figure 1-1, the overall
architecture of a data translator consists of three major functional
modules - Reader, Writer, and Restructurer. The major thrust of the current
version of the Michigan Data Translator is the development of a comprehensive
set of reorganization capabilities for IDS/I, a DBTG-like database
management system.
Important features of the design are:
o Complete  data  descriptions  of  the  user's  (currently  existing) 
source  database,  and  (proposed)  target  database  are  obtained 
using  an  augmented  IDS  data  description, 
o These  descriptions  are  adequate  for  the  Reader  to  create  the 
translator’s  internal  form  of  the  source  database, 
o A high-level  restructuring  language  has  been  developed  to  specify 
the  restructuring  of  a source  database  into  a target  database, 
o The  execution  of  the  process  is  automatic,  based  entirely  on  these 
descriptions. 

It is the intent of this paper to present our approach to, and development
of, a high-level language to specify restructuring transformations for
the Michigan Data Translator. Basically, the goal of a restructuring
transformation is to change the logical structure of a database in
response to new information or processing requirements. Navathe and Fry
[R2] provide a categorization of restructuring capabilities for the
hierarchical class of logical structures and also develop several fundamental
restructuring operations. In this paper, we broaden the scope of
restructuring capabilities to the class of network logical structures, including
several "less theoretical" transformations which users need. To achieve
this we present a data model general enough to support multiple user and
implementation views; we develop an access path-oriented language for
specifying restructuring transformations; and we describe a simple restructuring
strategy to perform the necessary data manipulations.

This section is completed by reviewing the current approaches to
restructuring and developing the requirements for a generalized
restructurer. In Section 2 we discuss the approach and basic research. Section
3 presents the access path restructuring language and provides an example
of its application; Section 4 contains our conclusions.

1.1 Current Research

It is interesting to observe that the large majority of the development
of restructuring technology has not taken place in
the development of database management systems, but rather in the context
of data translation systems. In addition to the work at Michigan, two
other efforts have addressed the problem of specifying restructuring
transformations using a methodology similar to the one shown in Figure
1-1. CONVERT, a high-level restructuring language developed at the IBM
Research Laboratory [R3], provides a powerful set of restructuring
operations based on a hierarchic data model called a Form. This is a
two-dimensional representation of hierarchic data which reflects not only the
schema but also the data instances. The headings of the Form are the
schema constructs - record types and items; relationships are maintained
through the key item of the parent record type. The restructuring
transformations are based on a set of Form operators which operate on one or
more Forms (or their components), resulting in a new Form. With the
exception of assignment and CASE, all Form operations can be nested.
An extensive repertoire of Form operators, ranging from Form manipulation
(e.g. MERGE, GRAFT) through built-in functions (e.g. SUM, COUNT) to the
CASE statement, has been developed.

Another specific data model operations approach is the work at
System Development Corporation reported by Shoshani [R5]. Instead of the
Form data model, this approach uses a standard hierarchic data model and
assumes that the source and target databases have been defined in terms of
this data model. Eleven "conversion operations" are defined to describe
the source-to-target assignments. These operations range from the
DIRECT function, which provides a one-to-one assignment of source items to
target items, through INVERSION, which causes the parent/dependent record
type relationship to be inverted.

An obvious advantage of the data model operations approach is that the
resultant software system is inherently simple and a set of low level
subroutines can be built to perform these elementary operations.
Unfortunately, the user, although involved at a high level, still views
restructuring essentially as a sequence of low level steps. Thus he is
not shielded from many of the details of restructuring.

In both cases the restructuring is defined in terms of a particular
data model, implying that the data must be converted to this form in
order to be restructured. Although the hierarchic data model facilitates
the development of restructuring transformations, it is not as general
as the more powerful Relational and Network data models. Thus,
restructuring these more complex structures becomes very difficult in CONVERT
and perhaps impossible in CDTL.
In general the low level operations approach works well on simple
small logical substructures but becomes increasingly more complex in
direct proportion to the size and complexity of the source structure.
Furthermore, the more complex the structure, the greater the possibility
that it will contain substructures that are not restructurable. Hence
we found it difficult to generalize the elementary operations approach
to our problem - the network class of logical structures.

1.2 Requirements for Restructuring Network Databases

In  reviewing  the  current  approaches  to  hierarchical  restructuring 
operations  and  the  extended  capabilities  necessary  for  network  structures 
we  find  the  following  requirements  are  necessary. 

o The data model should be general enough to accommodate as many
existing data models (network, relational, and hierarchic) as
possible, while at the same time providing a basis for a practical
restructuring algorithm. It should also provide the restructuring
user with as many views of the database as possible. Further, it
should provide the basis for a small set of constructs, simple
enough to be easily understood by users, and readily analyzed by
the implementors of the restructuring system.

o The  restructuring  system  should  be  provably  powerful,  but  it  should 
not  be  limited  to  the  implementation  of  a set  of  operations 
which  manipulate  a specific  data  model.  Its  restructuring  algorithm 
should  be  as  simple  as  possible,  for  obvious  reasons  of  efficiency, 
verification,  and  debugging. 

o The language in which restructuring specifications are written
should be based on a set of constructs which are simple and easily
understood, but also general enough to accommodate the multiple views
of data supported by the restructuring data model. In particular,
it should not be tied to any specific set of data-model-dependent
operations. It should provide the capability to specify the less
formal restructuring transformations such as adding an indexing
set. Finally, the full power of the restructuring system should
be readily available; all restructuring specifications should have
similar structures, and essentially the same level of complexity.
2.0 TECHNICAL APPROACH

Three main components are necessary to develop a powerful user-
oriented restructuring capability: a high-level language for the
specification of restructuring transformations, a comprehensive data model which
is supportive of the language and capable of representing established data
models, and an algorithm which is capable of performing the transformations
specified by the language.

2.1 The Relational Interface Model

Since none of the existing data models provide the capabilities
needed, the Relational Interface Model [R7] was developed. The Relational
Interface Model (RIM) is not a new model but rather a synthesis of the
salient features of extant data models. It is sufficiently general to
accommodate the hierarchical, network, and relational views of data necessary
to represent the major data models implemented in current database
management systems. Furthermore, it permits a database to be viewed simultaneously
as a network database and a relational database in First Normal Form.

Record  types  in  the  network  view  correspond  one-for-one  with  relation 
types  in  the  relational  view,  and  item  types  correspond  to  domains.  Since 
First  Normal  Form  does  not  permit  domains  to  contain  relations,  the  RIM 
prohibits  certain  network  constructs  such  as  naming  groups  and  contained-in- 
repeating  groups. 

In addition to records and items, two additional constructs of the
network model need to be addressed. Sets are used to represent logical
connections between record types, and keys are used to uniquely identify record
instances. In the RIM, these are implemented in a uniform way: by specially
designated data items. Each record type must contain a primary key; that
is, a collection of items whose combined values uniquely identify a record
instance within its record type. For example, in DBTG databases, the hash
field in a CALC record may serve as its primary key. However, some record
types, such as link records, may not contain such a collection of items.
In order to completely identify these records, the identities of the record
instances which own them along certain sets must be known. For example, in
the database of Figure 1-2, an instance of S-C-LINK is identified by the
SS# of the STUDENT record which owns it along ENROLLED-IN, and the Course#
of the COURSE record which owns it along TAKEN-BY. Thus, the primary keys
of certain record types must include information used to establish set
membership.


[Diagrams not recoverable; legible labels: STUDENT, COURSE]

Figure 1-3


In the RIM data model, set membership is established by storing, in
each instance of the member record type, a copy of the primary key of the
appropriate owner record instance. This copy resides in a collection of
special-purpose items, known as set-significant items. Each set-significant
item corresponds to one and only one owner primary key item, and a
set-significant item is used to represent one and only one set. It is said to be
significant to the set it is used to represent. Figure 1-3 exhibits the
database of Figure 1-2, augmented to include set-significant items.
Set-significant items are shown in dotted boxes, and set arrows point to the
set-significant items which represent them. Convenient names for
set-significant items may be formed by concatenating the corresponding owner
primary key item name with the set name enclosed in angle brackets. Primary key
items are underlined. Notice that set-significant items may be primary key items
as well, as in the S-C-LINK record type. Thus, during the construction of a RIM
representation for a network database, the specification of primary keys for some
record types is contingent upon the specification of certain set-significant items,
which is, in turn, contingent upon the specification of primary keys for the
owner record types. The possibility of infinite regress exists, but in our
experience it has never occurred, and since IDS databases do not permit
cycles of sets, it is impossible.
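The dependency between primary keys and set-significant items described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout and the helper names `set_significant_name` and `primary_key` are hypothetical and are not part of the Michigan Data Translator; the record, set, and key item names follow the Figure 1-2 example in the text.

```python
# Hypothetical RIM-style description of the Figure 1-2 database.
# Each set maps to (owner record type, member record type).
SETS = {
    "ENROLLED-IN": ("STUDENT", "S-C-LINK"),
    "TAKEN-BY": ("COURSE", "S-C-LINK"),
}

# Primary key items that are ordinary (not set-significant) items.
OWN_KEY_ITEMS = {
    "STUDENT": ["SS#"],
    "COURSE": ["COURSE#"],
    "S-C-LINK": [],  # a link record has no identifying items of its own
}

# Sets whose owner key must be copied into the member's primary key.
KEY_SETS = {"S-C-LINK": ["ENROLLED-IN", "TAKEN-BY"]}


def set_significant_name(owner_key_item, set_name):
    """Owner key item name followed by the set name in angle brackets."""
    return f"{owner_key_item}<{set_name}>"


def primary_key(record, _active=()):
    """Resolve a record type's primary key, following owners recursively.

    A cycle of sets would cause the infinite regress mentioned in the
    text; it is reported as an error instead of looping forever.
    """
    if record in _active:
        raise ValueError(f"cycle of sets through {record}")
    key = list(OWN_KEY_ITEMS[record])
    for set_name in KEY_SETS.get(record, []):
        owner, member = SETS[set_name]
        assert member == record
        for item in primary_key(owner, _active + (record,)):
            key.append(set_significant_name(item, set_name))
    return key
```

Under these assumptions, `primary_key("S-C-LINK")` is built entirely from set-significant items, one copied from each owner.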

The identification and naming of set-significant items, and the
specification of primary keys are performed within the user's source and
target data descriptions, and are assumed for the restructuring specification.


2.2 The Restructuring System

Deppe [R7] has shown that any transformation of a relational database
which can be specified by a sequence of the relational algebra operations
Join, Selection, Projection, and Union can be performed by another sequence
in which all Joins are computed first, followed by all Selections, then
Projections, and lastly, Unions. He also gives a simple algorithm which
performs all such ordered sequences with minimal use of temporary storage,
and a time bound of O(n), where n is the number of target record instances
(or tuples) to be created. Thus any restructuring system which is capable
of executing Deppe's algorithm has substantial power; it can perform all the
operations of the relational algebra on a relational database.
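The fixed ordering of joins, selections, projections, and unions can be illustrated with a small Python sketch. It is not Deppe's algorithm: it uses a naive cross product and makes no attempt to reproduce the storage or time bounds cited above. The function names and the dictionary layout of a "path" are assumptions made for this illustration.

```python
from itertools import product


def join_all(relations, join_conditions):
    """Step 1: combine one tuple per relation where every join condition holds."""
    names = list(relations)
    for combo in product(*(relations[name] for name in names)):
        row = dict(zip(names, combo))
        if all(cond(row) for cond in join_conditions):
            yield row


def build_target(paths):
    """Evaluate each path as joins, then selections, then projections,
    and take the union of the results as the target relation."""
    target = []
    for path in paths:
        for row in join_all(path["relations"], path["joins"]):
            # Step 2: selections are applied to the joined row.
            if not all(sel(row) for sel in path["selections"]):
                continue
            # Step 3: projection maps target items to (relation, source item).
            out = {t: row[rel][item]
                   for t, (rel, item) in path["projection"].items()}
            # Step 4: union, with duplicate tuples eliminated.
            if out not in target:
                target.append(out)
    return target
```

Each entry of `paths` plays the role of one ordered sequence; the outer loop supplies the final union.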

The restructuring system developed for the Michigan Data Translator
is based on the Deppe algorithm and views the data by using the Relational
Interface Model. It can perform any restructuring transformation which can
be specified by a sequence of relational algebra operations applied to
the source data viewed from the relational perspective of the RIM. Thus
its restructuring power is at least that of the relational algebra.
Furthermore, the system is not limited to relational operations. It can be
understood entirely from the network point of view, and specifications for it can
be written by users with no knowledge of the relational data model or
relational algebra. In addition, a synthesis of the network and relational
perspectives is possible. To some extent, the user has the "best of both
worlds"; he can select the data model best suited to a particular portion of
his restructuring transformation.
APSL, the Access Path Specification Language used by the Michigan
Data Translator to encode restructuring specifications, has evolved into
a language which satisfies the necessary requirements for a restructuring
specification language. In its present form, it is a powerful language;
the restructuring transformations which can be specified in it have at
least the power of the relational algebra. It is an inherently simple
language; its fundamental construct, the access path, is adequate to
direct the restructuring system's retrieval of source data, but it does
so without making use of any schema manipulations. That is, APSL
describes the source structures from which target data may be retrieved,
not operations intended to convert the source schema to the target schema.
Thus it is not tied to any specific set of operations which manipulate the
schema constructs of a particular data model. As such, it is more easily
understood and used than restructuring languages which require the user to
master a special set of restructuring operations. Furthermore, the
descriptive approach taken by APSL has considerable flexibility. It
permits the specifications for a complex restructuring transformation to be
structured in exactly the same way as the specifications for what might
be considered more elementary transformations. Since this fundamental
structure is basically quite simple, we feel justified in stating that APSL
is a verifiably powerful, but structurally simple, restructuring language
which lays claim to a reasonable degree of user-friendliness.


APSL is a block-structured language. Figure 2-2 shows the typical
nesting of blocks. At the outermost level is the TARGET RECORD statement.
An APSL description (that is, a complete set of APSL statements for a
particular restructuring transformation) contains one TARGET RECORD
statement for each target record or relation type. Each TARGET RECORD statement
is made up of one or more ACCESS PATH statements. From the network
perspective, APSL assumes that each target record instance is represented by
data which is contained in an instance of some hierarchical substructure
of the source database. This substructure need not be a strict subschema
of the source data structure; it may contain unravelled loops. These
substructures are described by ACCESS PATH statements. From the relational
perspective, ACCESS PATH statements are used to describe the joins which
are to be computed in order to create target tuples. A target relation
is the union of all the tuples created according to the ACCESS PATH
specifications given for it.

Each ACCESS PATH statement consists of one or more SOURCE RECORD
statements. From the network point of view, SOURCE RECORD statements
describe the nodes of the hierarchies from which target data is to be
retrieved; and from the relational point of view, they describe the relations
which take part in join operations. At the innermost level of APSL
structure are the ITEM statements. In network-oriented restructuring, their
task is to specify the correspondence between target data items and the
source data items which represent them, and to specify selection criteria
which distinguish valid or desirable instances of a hierarchy from invalid
or undesirable ones. In relational restructuring, they are used to ensure
that target tuples are created only from source tuples with appropriately
matching join fields, to specify selection operations, and to specify the
correspondence between target domains and source domains, i.e., projection
operations.
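The nesting of TARGET RECORD, ACCESS PATH, SOURCE RECORD, and ITEM blocks can be modelled as a small tree of Python dataclasses. This is a hypothetical in-memory representation written for illustration; the class names, field names, and the `render` helper are assumptions, and selection criteria are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    source_item: str
    assign_to: Optional[str] = None  # target item receiving the value


@dataclass
class SourceRecord:
    name: str
    access_via: str  # source set used to reach this node
    items: List[Item] = field(default_factory=list)


@dataclass
class AccessPath:
    path_id: str
    source_records: List[SourceRecord] = field(default_factory=list)


@dataclass
class TargetRecord:
    name: str
    access_paths: List[AccessPath] = field(default_factory=list)


def render(target: TargetRecord) -> str:
    """Write the tree out with one indentation level per block level."""
    lines = [f"TARGET RECORD {target.name}"]
    for path in target.access_paths:
        lines.append(f"  ACCESS PATH {path.path_id}")
        for rec in path.source_records:
            lines.append(f"    SOURCE RECORD {rec.name} ACCESS VIA {rec.access_via}")
            for item in rec.items:
                text = item.source_item
                if item.assign_to:
                    text += f" ASSIGN TO {item.assign_to}"
                lines.append(f"      {text}")
    return "\n".join(lines)
```

Representing a description as a tree keeps every transformation, simple or complex, in the same four-level shape that the text attributes to APSL.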

Syntax details and a moderately sized example appear in the next
section. Notice that we have delineated APSL's specification of the four
basic relational algebra operations. It should be clear from Section 3
that they are adequate to specify any sequence of joins, selections,
projections, and unions, in that order, when applied to the source data in its
relational form as seen through the RIM. By Deppe's theorem, we conclude
TARGET RECORD
    ACCESS PATH
        SOURCE RECORD
            ITEM
        SOURCE RECORD
    ACCESS PATH
TARGET RECORD
EOF

Figure 2-2
APSL Block Structure
that any restructuring transformation which can be performed by relational
algebra operators acting on the source data in its relational form can be
specified in APSL.

In addition, APSL's descriptive power extends to certain transformations
which are not readily describable in relational terms. These include such
network-oriented operations as creating indexing sets, bypassing nodes on
hierarchies, and altering the implementations of many-to-many relationships.
Examples of the last two appear in the Example of Section 3.

Finally,  practical  considerations  dictate  that  even  though  we  have 
relatively  little  field  experience  in  full-scale  data  translation,  our 
restructuring  systems  must  give  as  much  thought  to  execution  efficiency 
as  possible.  Restructuring  is  an  inherently  time-consuming  process  which 
must  create  a complete  target  database  "from  scratch",  and  in  order  to 
obtain  the  necessary  data,  must  traverse  the  entire  source  database  at 
least  once.  In  fact,  given  the  current  state  of  the  art,  parts  of  the  source 
database  will  be  traversed  many  times.  We  have  found  that  in  some  real- 
world cases, however, a large portion (70% or more) of the source data remains
unaltered  between  source  and  target.  In  such  cases,  much  processing  time 
is  spent  simply  copying  this  unchanging  data.  Using  an  interesting 
synthesis  of  the  relational  and  network  perspectives  afforded  by  the  RIM, 
we  have  developed  an  APSL  feature  known  as  partial  restructuring.  It 
permits  the  user  to  identify  the  unchanging  portion  (if  any)  of  his  database. 
Since complete representations of all the unchanged target records and the
sets  interrelating  them  already  exist  in  the  source  database,  they  are 
not  represented  in  the  restructuring  system's  target  internal  form  database, 
but  are  written  directly  from  the  internal  source  database  into  the  user's 
target  database.  In  this  manner,  substantial  savings  in  restructuring 
processor  time  and  internal  temporary  storage  use  can  be  achieved. 

The synthesis of the network and relational viewpoints that we used
to develop partial restructuring is very simple. The problems posed by
partial restructuring are most acute with network databases, in which it can
be extremely difficult to maintain the sets which connect the unchanging
portion of the database to the restructured portion [R1]. In particular,
the only definition of "unchanging portion" which we find adequate is stated
in relational terms: those record types whose source and target RIM
representations are identical are defined as unchanging records, and the sets of
the unchanging portion are those which interrelate unchanging records.
Thus the relational viewpoint provided by the RIM was used to solve a network
restructuring problem: the identification of unchanging subnetworks.
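The definition of the unchanging portion lends itself to a short Python sketch. It assumes, for illustration only, that each schema is a dictionary mapping record type names to their RIM representation (items, including set-significant items, plus the primary key) and set names to (owner, member) pairs, and that unchanged record types and sets keep the same names in source and target. The function name `unchanging_portion` is hypothetical.

```python
def unchanging_portion(source, target):
    """Return (records, sets) that are identical in source and target.

    A record type is unchanging when its RIM representation is the same
    on both sides; a set belongs to the unchanging portion only when it
    is the same on both sides and interrelates two unchanging records.
    """
    records = {
        name
        for name, representation in target["records"].items()
        if source["records"].get(name) == representation
    }
    sets = {
        name
        for name, (owner, member) in target["sets"].items()
        if source["sets"].get(name) == (owner, member)
        and owner in records
        and member in records
    }
    return records, sets
```

Because set-significant items are part of a record's representation, a record whose owner along some set changes is automatically excluded under this definition.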

In summary, our approach to restructuring is characterized by:

A. its data model - the RIM, which permits a database to be viewed
simultaneously as a network database and a relational database in
First Normal Form.

B. its restructuring system - based on a simple, efficient algorithm
which can perform all the operations of the relational algebra, but
is not restricted to any set of data-model-dependent operations.

C. its specification language - APSL, a verifiably powerful,
structurally simple, descriptive language. It is used to specify
source structures from which target data is to be retrieved,
rather than operations which convert source schema structures to
target schema structures.


3.0  APSL  SYNTAX  AND  DETAILED  EXAMPLES 

Section 3.1 contains a complete specification of APSL syntax. Detailed
semantic rules may be found in [DT12]. Section 3.2 illustrates the APSL
specification of some of the restructuring operations cited in Section 2.
The Appendix contains an example of a complete restructuring specification.


3.1  APSL  Syntax 
Notation: 

1. APSL reserved words appear in capital letters; user-determined
words in lower case.

2. Square brackets with bounds ([ ]^n_m: upper bound n, lower bound m)
indicate that the contents of the brackets
must occur at least m times and no more than n times. If the
upper bound is an "n" rather than an integer, the contents of
the brackets may repeat arbitrarily often. Square brackets with
no bounds ([ ]) indicate that their contents are optional.

3. Braces ({option1 | option2 | option3}) indicate that exactly one of
the options must be chosen.


APSL  Statements: 

A. TARGET RECORD Statement for Restructured Records

TARGET RECORD target-record-name
[ACCESS PATH Statement]^n_1

B. ACCESS PATH Statement

ACCESS PATH access-path-id
[target-item-name = {integer | float | literal}]^n_0
[SOURCE RECORD Statement]^n_1

C. SOURCE RECORD Statement

SOURCE RECORD source-record-name
ACCESS VIA source-set-name

[FROM  ID  = parent-node-identifier  KhembER/SneD  ^ 
[ITEM Statement]^n

[BLOCK  ASSIGNMENT  Statement]^ 



D. ITEM Statement

source-item-name
[SELECT IF {EQ | NE | GT | GE | LE | LT} VALUE Statement]^n
[WHEN QUALIFIED BY routine-name [(VALUE Statement)]]^n
ASSIGN TO target-item-name
[CONVERT WITH routine-name]
[NULL VALUE {integer | float | literal}]



E. VALUE Statement

{integer | float | source-item-name [FROM source-record-name [ID=identifier]]}



F.  BLOCK  ASSIGNMENT  Statement 


ACTUAL  DATA  IN  ORDER 
SET-SIGNIFICANT  DATA  BY 
OTHER  DATA  BY  NAME 
ALL  DATA  BY  NAME 
G. TARGET RECORD Statement for Unchanging Records

TARGET RECORD target-record-name IS
SOURCE RECORD source-record-name
SET target-set-name IS source-set-name
target-item-name IS source-item-name
[NULL VALUE {integer | float | literal}]


3.2 APSL Examples

Figure 3-1 shows a small RIM database. It is drawn as a Bachman
diagram (network view) augmented to show set-significant items in dotted
boxes and primary keys, which are underlined (relational view). It is
a modified subset of a larger database (see Appendix) which represents
information concerning a mansion, its contents and the families living
nearby. The subset we have chosen for Figure 3-1 represents neighboring
houses (NEIGHBORS records), the PEOPLE who LIVE-THERE, and the AUTOs they
own. Ownership of cars is implemented by a link record, since joint
ownership is possible. Purely for the purposes of this example, BOSS records
are included. These give the name of each person holding a managerial
position, and the company he or she works for. Finally, we have the ROOMS
of the mansion and the LAMPS and FURNITURE they contain.

Figures 3-2 through 3-6 show fragments of target databases which can
be derived from the data of Figure 3-1 using APSL. They illustrate some
of the major restructuring transformations discussed in Section 2.

3.2.1 Join, Projection

The PEOPLE relation of Figure 3-2 is the result of joining the source
relations NEIGHBORS and PEOPLE on the deed number domains, then projecting
onto the name, age, and address domains. Notice that this join is
represented in the source data by the set LIVE-THERE. APSL statements describing
the target PEOPLE record type follow.
[Figure 3-1: Source Data. Diagram not recoverable; legible labels include FURNISHED-WITH, LAMPS, FURNITURE, INSURANCE, CANDLE-POWER, TYPE, SIZE.]

[Figures 3-2 through 3-5: diagrams not recoverable; legible labels include ALL-PEOPLE, POOR-SISTERS, POOR-SISTER, YOUNGUNS, YOUNG-BOSS, SYSTEM, ALL-NEIGHBORS, NEIGHBOR, NAME, AGE, NET-WORTH, COMPANY, HOUSE-VALUE, ADDR, LIC-NO, MAKE, YEAR, VALUE.]

1.  TARGET RECORD PEOPLE
2.  ACCESS PATH PEOPLE-BUILDER
3.  SOURCE RECORD NEIGHBOR ACCESS VIA ALL-NEIGHBORS
4.  ADDR ASSIGN TO ADDRESS
5.  SOURCE RECORD PEOPLE ACCESS VIA LIVE-THERE
6.  NAME ASSIGN TO NAME
7.  AGE ASSIGN TO AGE

From the relational point of view, PEOPLE-BUILDER instructs the
restructuring system to retrieve all NEIGHBOR tuples and, for each particular
NEIGHBOR tuple, to retrieve all the PEOPLE tuples whose DEED#<LIVE> field
matches the NEIGHBOR's DEED# field (i.e., all PEOPLE related to the NEIGHBOR
along LIVE-THERE). From each such NEIGHBOR-PEOPLE pair, ADDR from the
NEIGHBOR, and NAME and AGE from the PEOPLE, are to be projected onto ADDRESS,
NAME, and AGE in the target PEOPLE record.
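
The relational reading of PEOPLE-BUILDER can be sketched in Python. This is only an illustration under stated assumptions: the in-memory lists, the sample values, and the function name `people_builder` are hypothetical and are not part of APSL or of the restructuring system.

```python
# Hypothetical in-memory stand-ins for the NEIGHBOR and PEOPLE relations.
neighbors = [
    {"DEED#": 1, "ADDR": "12 Elm St"},
    {"DEED#": 2, "ADDR": "14 Elm St"},
]
people = [
    {"NAME": "Ann", "AGE": 41, "DEED#<LIVE>": 1},
    {"NAME": "Bob", "AGE": 29, "DEED#<LIVE>": 1},
    {"NAME": "Cy", "AGE": 63, "DEED#<LIVE>": 2},
]


def people_builder(neighbors, people):
    """Join NEIGHBOR and PEOPLE along LIVE-THERE, then project."""
    target = []
    for n in neighbors:                          # SR NEIGHBOR AV ALL-NEIGHBORS
        for p in people:                         # SR PEOPLE AV LIVE-THERE
            if p["DEED#<LIVE>"] == n["DEED#"]:   # membership in the set occurrence
                target.append(
                    {"ADDRESS": n["ADDR"], "NAME": p["NAME"], "AGE": p["AGE"]}
                )
    return target


target_people = people_builder(neighbors, people)
```

In a network database the inner loop would be a traversal of the set occurrence rather than a scan with a comparison; the scan is used here only to keep the sketch self-contained.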

All  projections  are  specified  in  this  way,  i.e.,  by  ITEM  statements 
giving  the  source-domain-to-target-domain  correspondence,  or  by  BLOCK 
ASSIGNMENT  statements  which  specify  multiple  correspondences  simultaneously. 

All  joins  which  are  represented  by  source  sets  are  specified  as  above; 
i.e.,  by  simply  instructing  the  restructuring  system  to  traverse  the 
appropriate  set(s).  Joins  not  represented  by  source  sets  are  specified  by 
describing  access  to  each  of  the  relations  to  be  joined,  and  using  selection 
criteria to guarantee matching join fields. This is illustrated in Section 3.2.2.

From  the  network  perspective,  an  ACCESS  PATH  describes  a hierarchical 
substructure  of  the  source  data.  Each  node  of  the  hierarchy  is  described 
by  a SOURCE  RECORD  statement.  The  set  by  which  it  is  to  be  reached  from 
its  parent  node  is  specified  in  the  ACCESS  VIA  clause.  APSL  regards  all 
sets as two-way sets; passage from parent node to child node may be either
a member-to-owner or an owner-to-member access. Since RIM sets cannot have
multiple owner or member types, the basic form of the SOURCE RECORD statement
completely determines the parent node, with two exceptions. First, there may
be more than one node on the hierarchy containing an instance of the parent
node record type. In this case, the FROM clause resolves the ambiguity.

Second, if the set has the same owner and member record type, the direction
of access is not determined, and a MEMBER/OWNER or OWNER/MEMBER clause must
be given. During restructuring, each instance of the hierarchy is located,
and a target record instance is created from it with item values as specified
by the ITEM statements on the ACCESS PATH.
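
One way to picture the network reading is as a list of nodes, parent before child, each of which knows how to reach its records from the records already bound higher in the hierarchy. The sketch below is a minimal illustration only: the `Node` class, the `reach` callback, and the `instances` generator are hypothetical names, not the design of the actual restructuring system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List

Record = Dict[str, object]
Binding = Dict[str, Record]


@dataclass
class Node:
    ident: str                                  # record type name or ID= identifier
    reach: Callable[[Binding], List[Record]]    # plays the role of ACCESS VIA (and FROM)


def instances(nodes: List[Node], bound: Binding = None) -> Iterator[Binding]:
    """Yield one binding (ident -> record) per instance of the hierarchy."""
    bound = bound or {}
    if not nodes:
        yield dict(bound)
        return
    head, rest = nodes[0], nodes[1:]
    for rec in head.reach(bound):               # every record reachable at this node
        yield from instances(rest, {**bound, head.ident: rec})


# Hypothetical sample data for a NEIGHBOR -> PEOPLE hierarchy.
neighbors = [{"DEED#": 1, "ADDR": "12 Elm St"}, {"DEED#": 2, "ADDR": "14 Elm St"}]
people = [{"NAME": "Ann", "DEED#<LIVE>": 1}, {"NAME": "Cy", "DEED#<LIVE>": 2}]

path = [
    Node("NEIGHBOR", lambda b: neighbors),
    Node(
        "PEOPLE",
        lambda b: [p for p in people if p["DEED#<LIVE>"] == b["NEIGHBOR"]["DEED#"]],
    ),
]
bindings = list(instances(path))
```

Because each `reach` callback receives all previously bound records, a FROM clause corresponds simply to choosing which bound identifier the callback looks up.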

3.2.2  Selections,  More  Joins

Figure  3-3  shows  a YOUNG-BOSS  relation,  which  is  computed  by  joining 
the  source  relations  PEOPLE  and  BOSS  on  NAME,  and  for  all  such  tuples  with 
the  AGE  <30,  projecting  NAME,  AGE,  and  COMPANY  into  a target  tuple. 

The  desired  join  is  not  represented  explicitly  in  the  source  data, 
so  access  directions  for  BOSS  and  PEOPLE  are  given,  and  the  selection  on 
line  7 checks  to  see  that  a YOUNG-BOSS  is  created  only  from  a BOSS  and  a 
PEOPLE with matching NAMEs.

We  adopt  the  following  abbreviations: 

TR  for  TARGET  RECORD 
AP  for  ACCESS  PATH 
SR  for  SOURCE  RECORD 
AV  for  ACCESS  VIA 
AT  for  ASSIGN  TO 

1.  TR  YOUNG-BOSS 

2.  AP  YOUNG-ONE 

3.  SR  BOSS  AV  ALL-BOSSES 

4.  COMPANY  ASSIGN  TO  COMPANY 

5.  SR  NEIGHBOR  AV  ALL-NEIGHBORS 

6.  SR  PEOPLE  AV  LIVE-THERE 

7.  NAME  SELECT  IF  EQ  NAME  FROM  BOSS  AT  NAME 

8.  AGE  SELECT  IF  LT  30  AT  AGE 

A target record (or tuple) is created from an instance of a source
hierarchy  (or  set  of  source  tuples)  only  if  all  the  selection  criteria 
specified  for  the  ACCESS  PATH  are  satisfied. 
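
The effect of the two SELECT IF clauses in YOUNG-ONE can be sketched as a filtered join. The sample data and the function name `young_one` are assumptions made for illustration, and the traversal through NEIGHBOR is flattened into a single list of PEOPLE.

```python
# Hypothetical sample data; PEOPLE is shown already flattened out of LIVE-THERE.
bosses = [{"NAME": "Ann", "COMPANY": "Acme"}, {"NAME": "Cy", "COMPANY": "Zenith"}]
people = [
    {"NAME": "Ann", "AGE": 28},
    {"NAME": "Bob", "AGE": 25},
    {"NAME": "Cy", "AGE": 63},
]


def young_one(bosses, people, age_limit=30):
    """Create a YOUNG-BOSS only when every selection criterion holds."""
    target = []
    for b in bosses:                              # SR BOSS AV ALL-BOSSES
        for p in people:                          # SR PEOPLE AV LIVE-THERE
            name_matches = p["NAME"] == b["NAME"]     # NAME SELECT IF EQ NAME FROM BOSS
            is_young = p["AGE"] < age_limit           # AGE SELECT IF LT 30
            if name_matches and is_young:
                target.append(
                    {"NAME": p["NAME"], "AGE": p["AGE"], "COMPANY": b["COMPANY"]}
                )
    return target


young_bosses = young_one(bosses, people)
```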

3.2.3  Unions 

The  relational  algebra  operation  of  Union  is  easily  specified  in 
APSL;  a target  relation  is  the  union  of  all  tuples  created  from  all  the 
ACCESS  PATHs  specified  for  that  relation.  For  example,  in  Figure  3-4  the 
POOR-SISTER  relation  is  the  union  of  POOR-SISTERs  related  through  the  mother 
and  those  through  the  father.  The  two  possibilities  are  accounted  for 
by the POOR-MOM and POOR-DAD ACCESS PATHs respectively.
relational algebra. In addition, a synthesis of the network and relational
perspectives is possible. To some extent, the user has the "best of both
1.  TR PEOPLE
2.  AP PEOPLE
3.  SR NEIGHBOR AV ALL-NEIGHBORS
4.  SR PEOPLE AV LIVE-THERE
5.  NAME AT NAME
6.  AGE AT AGE
7.  NET-WORTH AT NET-WORTH
8.  TR POOR-SISTER
9.  AP POOR-MOM
10.  SR NEIGHBOR AV ALL-NEIGHBORS
11.  SR PEOPLE ID=MOM AV LIVE-THERE
12.  SR PEOPLE ID=KID AV MOTHER-OF FROM MOM OWNER/MEMBER
13.  NAME AT NAME<POOR>
14.  SR PEOPLE ID=SIS AV MOTHER-OF FROM MOM OWNER/MEMBER
15.  NAME SELECT IF NE NAME FROM PEOPLE ID=KID AT NAME
16.  NET-WORTH SELECT IF LE 2000 AT NET-WORTH
17.  SEX SELECT IF EQ 'FEMALE'
18.  AP POOR-DAD
19.  SR NEIGHBOR AV ALL-NEIGHBORS
20.  SR PEOPLE ID=DAD AV LIVE-THERE
21.  SR PEOPLE ID=KID AV FATHER-OF FROM DAD OWNER/MEMBER
22.  NAME AT NAME<POOR>
23.  SR PEOPLE ID=SIS AV FATHER-OF FROM DAD OWNER/MEMBER
24.  NAME SELECT IF NE NAME FROM PEOPLE ID=KID AT NAME
25.  NET-WORTH SELECT IF LE 2000 AT NET-WORTH
26.  SEX SELECT IF EQ 'FEMALE'
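
The union semantics of POOR-MOM and POOR-DAD can be sketched as two traversals whose results are merged into one set. The flat PEOPLE list with MOTHER and FATHER fields, the sample values, and the helper `poor_sisters_via` are hypothetical simplifications of the MOTHER-OF and FATHER-OF sets.

```python
# Hypothetical sample data: MOTHER / FATHER stand in for the MOTHER-OF and
# FATHER-OF set memberships of each PEOPLE record.
people = [
    {"NAME": "Mae", "SEX": "FEMALE", "NET-WORTH": 9000, "MOTHER": None, "FATHER": None},
    {"NAME": "Al", "SEX": "MALE", "NET-WORTH": 8000, "MOTHER": None, "FATHER": None},
    {"NAME": "Kim", "SEX": "FEMALE", "NET-WORTH": 1500, "MOTHER": "Mae", "FATHER": "Al"},
    {"NAME": "Lou", "SEX": "MALE", "NET-WORTH": 5000, "MOTHER": "Mae", "FATHER": "Al"},
    {"NAME": "Dee", "SEX": "FEMALE", "NET-WORTH": 500, "MOTHER": None, "FATHER": "Al"},
]


def poor_sisters_via(parent_field, people, limit=2000):
    """One ACCESS PATH: parent -> KID and parent -> SIS, with the three selections."""
    out = set()
    for parent in people:
        kids = [p for p in people if p[parent_field] == parent["NAME"]]
        for kid in kids:
            for sis in kids:
                if (
                    sis["NAME"] != kid["NAME"]          # NAME SELECT IF NE ... ID=KID
                    and sis["NET-WORTH"] <= limit       # NET-WORTH SELECT IF LE 2000
                    and sis["SEX"] == "FEMALE"          # SEX SELECT IF EQ 'FEMALE'
                ):
                    # (NAME<POOR>, NAME, NET-WORTH)
                    out.add((kid["NAME"], sis["NAME"], sis["NET-WORTH"]))
    return out


via_mom = poor_sisters_via("MOTHER", people)
via_dad = poor_sisters_via("FATHER", people)
poor_sister = via_mom | via_dad   # the target relation is the union of both paths
```

A tuple produced by both paths, such as a full sister who shares both parents, appears only once in the result, which is the behavior expected of a relational union.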


3.2.4  Network-Oriented  Operations 

Suppose  performance  considerations  require  frequent  fast  access  from 
a NEIGHBOR  record  to  all  the  AUTOs  which  belong  to  PEOPLE  who  live  in  the 
house,  and  suppose  that  no  car  is  jointly  owned  by  PEOPLE  living  in  different 
houses.  Then  the  CARS  set  of  Figure  3-5  is  a possible  solution.  It  could 
be  implemented  as  follows: 
1.  TR NEIGHBOR
2.  AP NEIGH
3.  SR NEIGHBOR AV ALL-NEIGHBORS
4.  ACTUAL DATA IN ORDER
5.  TR AUTO
6.  AP AUTO-BUILDER
7.  SR NEIGHBOR AV ALL-NEIGHBORS
8.  DEED# AT DEED#<CARS>
9.  SR PEOPLE AV LIVE-THERE
10.  SR POSSESSION-LINK AV OWNS
11.  SR AUTO AV OWNED-BY
12.  ACTUAL DATA IN ORDER
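
AUTO-BUILDER walks four levels of the source hierarchy and stamps each AUTO with the DEED# of the house it was reached from. The sketch below assumes, purely for illustration, that LIC-NO is the primary key of AUTO and that the link record carries NAME<OWNS> and LIC-NO<REG>; the sample data and the function name are hypothetical.

```python
# Hypothetical sample data for NEIGHBOR -> PEOPLE -> POSSESSION-LINK -> AUTO.
neighbors = [{"DEED#": 1}, {"DEED#": 2}]
people = [
    {"NAME": "Ann", "DEED#<LIVE>": 1},
    {"NAME": "Bob", "DEED#<LIVE>": 1},
    {"NAME": "Cy", "DEED#<LIVE>": 2},
]
links = [
    {"NAME<OWNS>": "Ann", "LIC-NO<REG>": "A1"},
    {"NAME<OWNS>": "Bob", "LIC-NO<REG>": "A1"},   # jointly owned, same house
    {"NAME<OWNS>": "Cy", "LIC-NO<REG>": "C7"},
]
autos = {
    "A1": {"LIC-NO": "A1", "MAKE": "Ford", "YEAR": 1974},
    "C7": {"LIC-NO": "C7", "MAKE": "Saab", "YEAR": 1971},
}


def auto_builder(neighbors, people, links, autos):
    """Build target AUTO records carrying the set-significant item DEED#<CARS>."""
    target = {}
    for n in neighbors:                                      # SR NEIGHBOR
        for p in people:                                     # SR PEOPLE AV LIVE-THERE
            if p["DEED#<LIVE>"] != n["DEED#"]:
                continue
            for link in links:                               # SR POSSESSION-LINK AV OWNS
                if link["NAME<OWNS>"] != p["NAME"]:
                    continue
                auto = autos[link["LIC-NO<REG>"]]            # SR AUTO AV OWNED-BY
                # Keyed on LIC-NO (assumed primary key), so a jointly owned
                # car reached twice yields a single target record.
                target[auto["LIC-NO"]] = {"DEED#<CARS>": n["DEED#"], **auto}
    return list(target.values())


target_autos = auto_builder(neighbors, people, links, autos)
```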


Another frequently occurring network restructuring transformation
involves  changing  the  implementation  of  many-to-many  relationships.  In 
Figure  3-6,  ownership  of  cars  is  represented  by  an  AUTO  record  for  each 
owner.  This  may  involve  duplicate  AUTO  data,  of  course.  We  will  assume 
that  PEOPLE  is  created  according  to  the  PEOPLE-BUILDER  ACCESS  PATH  of 
Section  3.2.1. 


1.  TR CAR
2.  AP CAR-FACTORY
3.  SR AUTO AV ALL-CARS
4.  ACTUAL DATA IN ORDER
5.  SR POSSESSION-LINK AV OWNED-BY
6.  NAME<OWNS> AT NAME<OWNS>
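
CAR-FACTORY replaces the link record by repeating the AUTO data once per owner. A minimal sketch follows, with hypothetical sample data and the assumption that the link record identifies its car through a LIC-NO<REG> item.

```python
# Hypothetical sample data: one jointly owned car and one singly owned car.
autos = [
    {"LIC-NO": "A1", "MAKE": "Ford"},
    {"LIC-NO": "C7", "MAKE": "Saab"},
]
links = [
    {"NAME<OWNS>": "Ann", "LIC-NO<REG>": "A1"},
    {"NAME<OWNS>": "Bob", "LIC-NO<REG>": "A1"},
    {"NAME<OWNS>": "Cy", "LIC-NO<REG>": "C7"},
]


def car_factory(autos, links):
    """Emit one CAR per (AUTO, owner) pair, duplicating the AUTO data."""
    target = []
    for auto in autos:                                  # SR AUTO AV ALL-CARS
        for link in links:                              # SR POSSESSION-LINK AV OWNED-BY
            if link["LIC-NO<REG>"] == auto["LIC-NO"]:
                target.append({**auto, "NAME<OWNS>": link["NAME<OWNS>"]})
    return target


cars = car_factory(autos, links)
```

The jointly owned car appears twice in the result, once under each owner, which is the duplication of AUTO data mentioned above.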


3.2.5  Partial  Restructuring 

Suppose  that  the  ROOM,  LAMP,  and  FURNITURE  record  types  are  not  to  be 
changed  between  source  and  target.  Then  APSL  allows  rather  simple  specifi- 
cations for  them: 

1.  TR  ROOM  IS  SR  ROOM 

2.  TR  LAMP  IS  SR  LAMP 

3.  TR  FURNITURE  IS  SR  FURNITURE 

Partial  restructuring  specifications  may  be  somewhat  more  complex  when 
sets  connect  the  unchanging  portion  of  the  database  to  the  restructured 
portion.  (See  appendix). 
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4.0  CONCLUSIONS 

We  have  presented  the  salient  features  of  an  approach  to  network 
database  restructuring  which  we  feel  has  considerable  power  and  generality. 
These  include  the  RIM  data  model,  which  permits  a database  to  be  viewed 
simultaneously  as  a network  and  as  a relational  database  in  first  normal 
form,  and  a restructuring  system  which  can  perform  all  the  operations 
of  the  relational  algebra,  but  is  not  restricted  to  any  set  of  data- 
model-dependent  operations.  We  have  described  a powerful,  but  structur- 
ally simple  language,  APSL,  for  the  specification  of  restructuring  trans- 
formations, and  we  have  illustrated  its  use  in  describing  both  relational- 
and  network-oriented  operations.  APSL  has  been  implemented  as  part  of 
the  Michigan  Data  Translator  [DT  ]. 

Finally,  it  is  important  to  observe  that  although  APSL  was  developed 
as  a restructuring  specifications  language,  it  has  potential  applications 
outside  the  restructuring  area.  For  example,  since  APSL  descriptions  are 
not  based  on  sequences  of  operations  which  manipulate  the  constructs 
of  a particular  data  model,  it  may  prove  useful  in  specifying  high-level 
DML operations in the area of database application program description,
validation,  and  translation. 
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APPENDIX  A 
A DETAILED  EXAMPLE. 

Figure  A-l  shows  a Bachman  diagram  for  a database  schema,  augmented 
to  include  set-significant  items  (in  dotted  boxes)  and  primary  key  items 
(underlined). For ease of reference, Figure A-2 shows its records and
sets. The database is a thoroughly contrived one which might be used
as an aid in the upkeep of a large mansion. Its rather bizarre semantics
are summarized below.

1.  ROOM: one for each room of the mansion, including the three
    garages. Rooms are identified by their names. Also stored are the
    rooms' dimensions and functions.

2.  CLOSET: one for each closet in the house. A closet number is
    unique over all closets in the mansion. Also known are the closet's
    location (i.e., north wall, west wall, etc.), dimensions, and door
    type (i.e., sliding, folding, swinging, etc.).

3.  VALUABLES: one for each object worth at least $500 currently stored
    in a closet. Also given are a description of the valuable (TYPE) and
    its (appraised) VALUE.

4.  POSSESSION-LINK: a link record used to represent ownership (perhaps
    joint) of valuables and vehicles.

5.  PEOPLE: one for each resident of the mansion and each neighbor.
    Roles of items should be apparent.

6.  NEIGHBORS: one for each neighboring household (the area is zoned
    for one-family dwellings, so this is equivalent to one for each
    nearby house). Roles of items should be apparent.

7.  PRIME-USER: a link record used to represent who uses which room most
    frequently. The cook will be the prime user of the kitchen, the aging
    patriarch the prime user of the study, etc. Some rooms, such as the
    dining room, living room, and billiard room, will have several prime
    users. Neighbors are forbidden to be prime users of rooms. Thus, a
    PEOPLE record which owns a USES set cannot be a member of LIVES-THERE
    and vice versa.

8.  AUTO: one for each car owned by mansion residents or neighbors.
    Roles of items should be apparent. AUTO records for cars belonging to
    mansion residents are members of the CARS set owned by the garage in
    which they are kept.

9.  LOCATION-LINK: a link record used to establish past and present
    locations of the mansion's furnishings (LIGHTING, FURNISHING, FLOORS,
    and WINDOWS are mutually exclusive sets). Current-flag is 1 if the
    object is now in the ROOM, 0 otherwise. Last-date has no meaning if
    Current-flag is 1. If Current-flag is 0, Last-date is the date the
    object was last moved out of the ROOM.

10.  LAMP: one for each piece of lighting equipment in the mansion.
    Burn-flag indicates whether or not bulb(s) in the lamp are burned out.

11.  FURNITURE: one for each piece of furniture in the mansion.
    Repair-flag is 1 if the object needs repair, 0 if not. If
    Repair-flag = 1, Repair-Desc contains a description (in 256 characters
    or less) of the needed repairs.

12.  CARPETING: one for each piece of floor covering in the house.
    Mend-flag and Mending-Desc work the same way as Repair-flag and
    Repair-Desc for furniture.

13.  WINDOW: one for each of the mansion's windows. They are identified
    by window numbers taken from the mansion's blueprints, since insurance
    numbers were not assigned. Open-flag indicates whether or not the
    window can be opened, storm-flag whether or not it takes a storm
    window, and breakage-flag whether or not the window is currently
    broken. If the window is broken and a neighbor is at fault, CULPRIT
    contains the ADDR of the household to which a bill will be sent.
    CULPRIT is blank if the guilty party is unknown, not human, or a
    resident of the mansion. Observe that members of the WINDOWS set
    always have Current-flag = 1.

The bulk of the database is taken up by a great many of the large
LAMP, FURNITURE, CARPETING, and WINDOW records. These are to be left
unchanged by a restructuring transformation whose primary purpose is to
separate the residents of the mansion from their neighbors (US and THEM,
respectively). Figure A-3 shows the desired target database; Figure A-4,
its records and sets. Several other changes are to be made as well,
notably the addition of OLD-RICH-UNCLE records. An instance of the BUCKS
set consists of an US record as owner, and as members, one OLD-RICH-UNCLE
record for each of that person's uncles who is at least 65 and worth at
least $500,000.

APSL  statements  describing  this  restructuring  transformation, 
together  with  some  explanatory  comments,  make  up  the  remainder  of  this 
appendix. 


Figure A-1  Mansion Database (Source)  [diagram not reproduced]

Figure A-1 (continued)  Mansion Database (Unchanging Portion)  [diagram not reproduced]


0.  /* MANSION DATABASE RESTRUCTURING APSL DESCRIPTION */
1.  MACRO SR LITERALLY 'SOURCE RECORD'
2.  MACRO TR LITERALLY 'TARGET RECORD'
3.  MACRO AV LITERALLY 'ACCESS VIA'
4.  MACRO AT LITERALLY 'ASSIGN TO'
4.5  MACRO AP LITERALLY 'ACCESS PATH'

Remarks

--Comments begin with /* and end with */

--The APSL macro facility is limited to the form MACRO <Word> LITERALLY
'<APSL Reserved Word>'

5.  /* FIRST, THE UNCHANGING PORTION */
6.  TARGET RECORD LOCATION-LINK IS SOURCE RECORD LOCATION-LINK
7.  SET HAS-CONTAINED IS HAS-CONTAINED
8.  NAME<HAS-CON> IS NAME<HAS-CON>
9.  TARGET RECORD LAMP IS SOURCE RECORD LAMP
10.  TARGET RECORD FURNITURE IS SOURCE RECORD FURNITURE
11.  TARGET RECORD CARPETING IS SOURCE RECORD CARPETING
12.  TARGET RECORD WINDOW IS SOURCE RECORD WINDOW

Remarks

--The SET... specification of lines 7 and 8 is required because
HAS-CONTAINED is owned by the restructured record type ROOM.

13.  TR ROOM
14.  AP ROOMIE
15.  SR ROOM AV ALL-ROOMS
16.  NAME AT NAME
17.  FUNCTION AT FUNCTION
18.  DIMS AT BTUS CONVERT WITH HEAT

Remarks

--HEAT is the name of a routine which accepts the dimensions of a room as
input and produces the number of BTUs required to heat the room 1° C in
1 minute.

19.  TR  CLOSET 

20.  AP  CLOSET 

21.  SR ROOM AV ALL-ROOMS

22.  NAME  AT  NAME<STORE> 

23.  SR  CLOSET  AV  HAS-CLOS 

24.  NUMBER  AT  NUMBER 

25.  LOCATION  AT  LOCATION 

26.  DIMS  AT  DIMS 

Remarks 

--From  the  relational  viewpoint,  this  AP  describes  the  projection  of 
the  RIM  relation 

CLOSET (NAME<HAS>, NUMBER, LOCATION, DIMS, DOOR-TYPE)
onto  its  first  four  fields. 
27.  TR VALUABLES
28.  AP LUX
29.  SR ROOM AV ALL-ROOMS
30.  NAME AT NAME<STORED>
31.  SR CLOSET AV HAS-CLOS
32.  SR VALUABLES AV CONTAINS
33.  ACTUAL DATA IN ORDER

Remarks 


--From the relational viewpoint, this AP describes the join of
CLOSET (NAME<HAS>, NUMBER, LOCATION, DIMS, DOOR-TYPE) and VALUABLES
(NUMBER<CONTAINS>, INSURANCE#, TYPE, VALUE) on the NUMBER fields,
followed by projection onto the NAME<HAS>, INSURANCE#, TYPE, and
VALUE fields.


34.  TR OUR-STUFF
35.  AP OURS
36.  SR ROOM AV ALL-ROOMS
37.  SR CLOSET AV HAS-CLOS
38.  SR VALUABLES AV CONTAINS
39.  INSURANCE# AT INS#<OWNED>
40.  SR POSSESSION-LINK AV OWNED-BY
41.  NAME<OWNS> AT NAME<OWNS>
42.  SS#<OWNS> AT SS#<OWNS>
43.  TR PRIME-USER
44.  AP PRIME
45.  SR ROOM AV ALL-ROOMS
46.  SR PRIME-USER AV USED-BY
47.  ALL DATA BY NAME
48.  TR US
49.  AP WE
50.  SR ROOM AV ALL-ROOMS
51.  SR PRIME-USER AV USED-BY
52.  SR PEOPLE ID=PERSON AV USES
53.  NAME AT NAME
54.  SS# AT SS#
55.  AGE AT AGE
56.  NET-WORTH AT NET-WORTH
57.  SEX AT SEX
58.  SR PEOPLE ID=MOM AV MOTHER-OF FROM PERSON MEMBER/OWNER
59.  ACCEPT IF NULL
60.  NAME AT NAME<MOTHER>
61.  SS# AT SS#<MOTHER> NULL VALUE =
62.  SR PEOPLE ID=DAD AV FATHER-OF FROM PERSON MEMBER/OWNER
63.  ACCEPT IF NULL
64.  NAME AT NAME<FATHER>
65.  SS# AT SS#<FATHER> NULL VALUE = 0
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Remarks 

--Observe the use of identifiers in distinguishing among the three
appearances of PEOPLE on this tree.

--Since every resident of the mansion is the prime user of a bedroom,
and only residents of the mansion can be prime users of rooms, this
AP completely describes the construction of target US records.

--Note further that since a person may be a prime user of several
rooms, a restructuring system driven by this APSL description will
create and discard some duplicate US record instances.

--ACCEPT IF NULL is used to prevent the loss of PEOPLE records which
are not members of the FATHER-OF and/or MOTHER-OF sets.

--MEMBER/OWNER is used to guarantee that a person's parents, rather
than his or her children, are recorded as MOTHER and FATHER.
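
The combined effect of ACCEPT IF NULL, NULL VALUE, and the discarding of duplicates can be sketched as follows. Everything here is an illustrative assumption: the dictionary keyed on SS#, the MOTHER-SS# field standing in for MOTHER-OF membership, the null value 0 (taken from the FATHER statement of the listing), and the function name `build_us`.

```python
NULL_SS = 0   # assumed NULL VALUE for a missing parent's SS#

# Hypothetical sample data: PEOPLE keyed on SS#; MOTHER-SS# is None when the
# person is not a member of any MOTHER-OF set occurrence.
people = {
    101: {"NAME": "Eve", "MOTHER-SS#": None},
    102: {"NAME": "Ira", "MOTHER-SS#": 101},
}
# (room, SS# of prime user); Ira is prime user of two rooms.
prime_users = [("STUDY", 102), ("BEDROOM-1", 102), ("BEDROOM-2", 101)]


def build_us(prime_users, people):
    """Build US records, keeping people without a mother and dropping duplicates."""
    target = {}
    for _room, ss in prime_users:                    # ROOM -> PRIME-USER -> PEOPLE
        person = people[ss]
        mom = people.get(person["MOTHER-SS#"])       # None when the set is empty
        record = {
            "SS#": ss,
            "NAME": person["NAME"],
            # ACCEPT IF NULL: the record is still built when mom is missing.
            "NAME<MOTHER>": mom["NAME"] if mom else None,
            "SS#<MOTHER>": person["MOTHER-SS#"] if mom else NULL_SS,
        }
        target.setdefault(ss, record)                # later duplicates are discarded
    return list(target.values())


us_records = build_us(prime_users, people)
```

Without the ACCEPT IF NULL behavior, the traversal would behave like an inner join and the record for the person with no recorded mother would be lost.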


64.  /* THEM RECORDS ARE SIMILAR TO US RECORDS */
65.  /* EXCEPT THAT THEY ARE FOUND IN THE SOURCE */
66.  /* BY PASSING THROUGH NEIGHBORS */
67.  TR THEM
68.  AP THEY
69.  SR NEIGHBORS AV ALL-NEIGHBORS
70.  ADDR AT ADDR<LIVE>
71.  SR PEOPLE ID=PERSON AV LIVE-THERE
72.  NAME AT NAME
73.  SS# AT SS#
74.  AGE AT AGE
75.  DESC AT DESC
76.  SEX AT SEX
77.  SR PEOPLE ID=MOM AV MOTHER-OF FROM PERSON MEMBER/OWNER
78.  ACCEPT IF NULL
79.  NAME AT MOTHER-NAME NULL VALUE = '*UNKNOWN*'
80.  SR PEOPLE ID=DAD AV FATHER-OF FROM PERSON MEMBER/OWNER
81.  ACCEPT IF NULL
82.  NAME AT FATHER-NAME NULL VALUE = '*UNKNOWN*'
83.  TR NEARBY-HOUSE
84.  AP NEAR
85.  SR NEIGHBORS AV ALL-NEIGHBORS
86.  ACTUAL DATA IN ORDER
87.  /* NOW FOR THE CARS */
88.  TR OUR-AUTO
89.  AP OURCAR
90.  SR ROOM AV ALL-ROOMS
91.  SR AUTO AV CARS
92.  ACTUAL DATA IN ORDER
93.  SR POSSESSION-LINK AV REGISTERED-TO
94.  NAME<OWNS> AT NAME<REG-TO>
95.  SS#<OWNS> AT SS#<REG-TO>

Remarks 

--Observe that the choice of primary key for OUR-AUTO implies that
residents of the mansion do not share ownership of their cars.

96.  TR THEIR-CAR
97.  AP THEIR-CAR
98.  SR NEIGHBORS AV ALL-NEIGHBORS
99.  ADDR AT ADDR<CARS>
100.  SR PEOPLE AV LIVE-THERE
101.  SR POSSESSION-LINK AV OWNS
102.  SR AUTO AV REGISTERED-TO
103.  ACTUAL DATA IN ORDER
104.  TR POSSESSION-LINK
105.  AP POSSE
106.  SR NEIGHBORS AV ALL-NEIGHBORS
107.  SR PEOPLE AV LIVE-THERE
108.  SR POSSESSION-LINK AV OWNS
109.  LIC-NO<REG> AT LIC-NO<OWNER-IS>
110.  NAME<OWNS> AT NAME<OWNER-OF>
111.  SS#<OWNS> AT SS#<OWNER-OF>

112.  /* LAST BUT CERTAINLY NOT LEAST, THE OLD */
113.  /* RICH UNCLES */
114.  TR OLD-RICH-UNCLES
115.  AP ORUNCMM
116.  SR ROOM AV ALL-ROOMS
117.  FUNCTION SELECT IF EQ 'BEDROOM'
118.  SR PRIME-USER AV USED-BY
119.  SR PEOPLE ID=GRANDMA AV USES
120.  SR PEOPLE ID=UNC AV MOTHER-OF FROM GRANDMA OWNER/MEMBER
121.  SEX SELECT IF EQ 'MALE'
122.  AGE SELECT IF GE 65
123.  NET-WORTH SELECT IF GE 500000 AT WORTH
124.  NAME AT NAME
125.  SR PEOPLE ID=MOM AV MOTHER-OF FROM GRANDMA OWNER/MEMBER
126.  SR PEOPLE ID=KID AV MOTHER-OF FROM MOM OWNER/MEMBER
127.  NAME AT NAME<BUCKS>
128.  SS# AT SS#<BUCKS>
129.  AP ORUNCMF
130.  SR ROOM AV ALL-ROOMS
131.  FUNCTION SELECT IF EQ 'BEDROOM'
132.  SR PRIME-USER AV USED-BY
133.  SR PEOPLE ID=GRANDMA AV USES
134.  SR PEOPLE ID=UNC AV MOTHER-OF FROM GRANDMA OWNER/MEMBER
135.  SEX SELECT IF EQ 'MALE'
136.  AGE SELECT IF GE 65
137.  NET-WORTH SELECT IF GE 500000 AT WORTH
138.  NAME AT NAME
139.  SR PEOPLE ID=DAD AV MOTHER-OF FROM GRANDMA OWNER/MEMBER
140.  NAME SELECT IF NE NAME FROM PEOPLE ID=UNC
141.  SR PEOPLE ID=KID AV FATHER-OF FROM DAD OWNER/MEMBER
142.  NAME AT NAME<BUCKS>
143.  SS# AT SS#<BUCKS>
144.  AP ORUNCFF
145.  SR ROOM AV ALL-ROOMS
146.  FUNCTION SELECT IF EQ 'BEDROOM'
147.  SR PRIME-USER AV USED-BY
148.  SR PEOPLE ID=GRANDPA AV USES
149.  SR PEOPLE ID=UNC AV FATHER-OF FROM GRANDPA OWNER/MEMBER
150.  SEX SELECT IF EQ 'MALE'
151.  AGE SELECT IF GE 65
152.  NET-WORTH SELECT IF GE 500000 AT WORTH
153.  NAME AT NAME
154.  SR PEOPLE ID=DAD AV FATHER-OF FROM GRANDPA OWNER/MEMBER
155.  NAME SELECT IF NE NAME FROM PEOPLE ID=UNC
156.  SR PEOPLE ID=KID AV FATHER-OF FROM DAD OWNER/MEMBER
157.  NAME AT NAME<BUCKS>
158.  SS# AT SS#<BUCKS>
159.  AP ORUNCFM
160.  SR ROOM AV ALL-ROOMS
161.  FUNCTION SELECT IF EQ 'BEDROOM'
162.  SR PRIME-USER AV USED-BY
163.  SR PEOPLE ID=GRANDPA AV USES
164.  SR PEOPLE ID=UNC AV FATHER-OF FROM GRANDPA OWNER/MEMBER
165.  SEX SELECT IF EQ 'MALE'
166.  AGE SELECT IF GE 65
167.  NET-WORTH SELECT IF GE 500000 AT WORTH
168.  NAME AT NAME
169.  SR PEOPLE ID=MOM AV FATHER-OF FROM GRANDPA OWNER/MEMBER
170.  SR PEOPLE ID=KID AV MOTHER-OF FROM MOM OWNER/MEMBER
171.  NAME AT NAME<BUCKS>
172.  SS# AT SS#<BUCKS>
173.  /* APSL DESCRIPTION COMPLETE */
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